Synthesizing Expressive Speech to Convey Focus using a Perturbation Model for Computer-Aided Pronunciation Training

Authors

Fanbo Meng

Helen Meng

Zhiyong Wu

Lianhong Cai

Abstract

We present a perturbation model that modifies the acoustic features of neutral speech in order to synthesize focus on selected words. In doing so, we can generate expressive speech output that highlights important speech segments to attract the listener’s attention. The ultimate objective is to synthesize corrective feedback in a computer-aided pronunciation training (CAPT) system. This work involves the design and collection of a speech corpus whose text prompts contain focus words. Each prompt is recorded twice: a neutral production followed by an expressive one in which specific words are highlighted with focus. The phones in these recordings are modeled in six different classes, based on their relations with stressed syllables in focus words. Phone boundaries are obtained automatically by forced alignment with an automatic speech recognizer. Acoustic features of the phones, relating to f0, energy and duration, are extracted. Features that have the highest correlation with the phone classes, as well as low variance, are incorporated into the perturbation model. The model is applied to neutral recordings of 20 test sentences. Results from a listening test show that the 13 subjects can identify the focus words with an accuracy of over 98%. The perceived degree of focus in the identified words achieves a mean score of 4.5 on a five-point Likert scale.
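The abstract does not give the form of the perturbation or the definition of the six phone classes, so the following is only a minimal sketch of the general idea: each phone of a neutral utterance carries a class label, and a per-class perturbation adjusts its f0, energy and duration. The class indices, the choice of multiplicative scaling for f0 and duration and an additive offset for energy, and every numeric value are assumptions made for illustration, not the authors' model or their measured parameters.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Phone:
    label: str
    phone_class: int    # hypothetical class index in 0..5
    mean_f0_hz: float
    energy_db: float
    duration_ms: float


@dataclass(frozen=True)
class Perturbation:
    f0_scale: float           # multiplies the phone's mean f0
    energy_offset_db: float   # added to the phone's energy
    duration_scale: float     # multiplies the phone's duration


# Placeholder values only; the paper derives its parameters from the corpus.
# Class 0 is assumed here to mean "unaffected by focus".
PERTURBATIONS = {
    0: Perturbation(1.0, 0.0, 1.0),
    1: Perturbation(1.3, 3.0, 1.4),
    2: Perturbation(1.2, 2.0, 1.2),
    3: Perturbation(1.1, 1.0, 1.1),
    4: Perturbation(0.95, -1.0, 1.0),
    5: Perturbation(0.9, -2.0, 1.0),
}


def perturb(phone, table=PERTURBATIONS):
    """Return a copy of the phone with its class perturbation applied."""
    p = table[phone.phone_class]
    return replace(
        phone,
        mean_f0_hz=phone.mean_f0_hz * p.f0_scale,
        energy_db=phone.energy_db + p.energy_offset_db,
        duration_ms=phone.duration_ms * p.duration_scale,
    )


def synthesize_focus(phones, table=PERTURBATIONS):
    """Apply the per-class perturbation to every phone of a neutral utterance."""
    return [perturb(ph, table) for ph in phones]
```

In a real system the perturbed targets would then drive a signal modification step that resynthesizes the waveform, which this sketch leaves out.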

Similar resources

Advantages of Using Computer in Teaching English Pronunciation

Pronunciation continues to grow in importance because of its key roles in speech recognition, speech perception, and speaker identity. Computers are increasingly being used in teaching English pronunciation to enhance its quality. The purpose of this paper is to discuss the advantages of using computers in English pronunciation instruction. Understanding the advantages of computers is an important ...

Automatic generation and pruning of phonetic mispronunciations to support computer-aided pronunciation training

This paper presents a mispronunciation detection system which uses automatic speech recognition to support computer-aided pronunciation training (CAPT). Our methodology extends a model pronunciation lexicon with possible phonetic mispronunciations that may appear in learners’ speech. Generation of these pronunciation variants was previously achieved by means of phone-to-phone mapping rules deriv...

Automatic Generation of Hypotheses for Automatic Diagnosis of Pronunciation Errors

This paper describes the use of a rule-based system for generation of pronunciation variants as a component of a speech-enabled computer-aided pronunciation learning (CAPL) system. This CAPL system is part of a computer-aided training system for recitation of the Holy Qur'an. It generates the most probable pronunciation error hypotheses that are fed to a hidden Markov model (HMM)-based speech rec...

Computer Assisted Pronunciation Teaching (CAPT) and Pedagogy: Improving EFL learners’ Pronunciation Using Clear Pronunciation 2 Software

This study examined the impact of Clear Pronunciation 2 software on teaching English suprasegmental features, focusing on stress, rhythm and intonation. In particular, the software covers five topics related to suprasegmental features: consonant clusters, word stress, connected speech, sentence stress and intonation. Seven Iranian EFL learners participated in this study. The study l...

Developing Speech Recognition and Synthesis Technologies to Support Computer-Aided Pronunciation Training for Chinese Learners of English

Copyright 2009 by Helen Meng. We describe ongoing research in the development of speech technologies that strives to raise the efficacy of computer-aided pronunciation training, especially for Chinese learners of English. Our approach is grounded on the theory of language transfer and involves a systematic phonological comparison between the primary language (L1 being Chinese) and seco...

Publication date: 2010
